
feat(scripts): remote devcontainer orchestration via just recipe (#70) #166

Open
gerchowl wants to merge 286 commits into dev from feature/70-remote-devc-orchestration


@gerchowl gerchowl commented Feb 23, 2026

Contributor

Description

Implements remote devcontainer orchestration: a single command (just remote-devc <host> or devc-remote.sh) that provisions a devcontainer on a remote host and connects your IDE to it. This enables developers to spin up devcontainers on powerful remote machines (GPU servers, cloud VMs) from their local terminal without manual SSH and compose steps.

Key capabilities

  • Core orchestration — SSH preflight, container state detection (fresh/running/stopped), compose lifecycle, IDE launch
  • gh:org/repo[:branch] targets — Clone a GitHub repo on the remote host and start its devcontainer in one command
  • --bootstrap flag — One-time remote host setup (config file, GHCR auth, image build)
  • --force flag — Auto-push unpushed commits before deploying; guards against deploying stale code
  • --open ssh|cursor|code|none — IDE-agnostic connection modes with auto-detection
  • Tailscale SSH integration — Ephemeral auth key generation via OAuth API, TUN device injection, peer wait polling
  • Claude Code CLI injection — Subscription OAuth token forwarding for AI-assisted development in containers
  • Container lifecycle execution — Runs post-create/post-start scripts inside the container after compose up
  • Compose file parsing — Reads dockerComposeFile from devcontainer.json, builds correct -f flags

Implementation

| Component | Purpose |
| --- | --- |
| scripts/devc-remote.sh | Bash orchestrator: parse_args, check_ssh, remote_preflight, inject_tailscale_key, inject_claude_auth, remote_compose_up, run_container_lifecycle, open_editor |
| scripts/devc_remote_uri.py | Python helper for Cursor/VS Code nested authority URIs (hex-encoded devcontainer specs) |
| justfile.base | remote-devc recipe wrapping devc-remote.sh with local git state auto-detection |
| setup-tailscale.sh | Opt-in Tailscale SSH daemon (install/start subcommands, lifecycle hooks) |
| setup-claude.sh | Opt-in Claude Code CLI (install/start subcommands, lifecycle hooks) |

Type of Change

  • feat -- New feature

Issues

Closes #152, #153, #221, #230, #231, #232, #235, #236, #243
Refs: #70, #208, #246

Testing

Manual Integration Test Results (#243)

36/39 items verified on a real remote host. Remaining 3 edge cases (low disk, missing compose, missing runtime) covered by unit tests.

Full test matrix

1. Core orchestration

  • devc-remote.sh myserver:~/Projects/fd5 — SSH, preflight, compose up
  • Re-run with container already running — skips compose up, opens editor
  • Re-run with container stopped — restarts and opens
  • --open none — infra only, no IDE launch
  • --open ssh — waits for Tailscale, prints hostname
  • --open code — opens VS Code instead of Cursor
  • --yes flag — auto-accepts prompts (reuse running container)

2. Tailscale SSH integration (#208, #230)

  • With TS_CLIENT_ID + TS_CLIENT_SECRET set — generates ephemeral key, injects into remote compose
  • Container joins tailnet after compose up
  • --open ssh mode — polls tailscale status, prints hostname when ready
  • Without TS env vars — silently skips (no error)
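The peer-wait polling used by `--open ssh` can be pictured as a generic poll-with-timeout loop. The following is a minimal sketch, not the script's actual code: `wait_until` and its parameters are hypothetical names, and the real check would inspect `tailscale status` output rather than the stand-in predicate used here.

```python
import time
from typing import Callable


def wait_until(
    check: Callable[[], bool],
    timeout: float = 60.0,
    interval: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Call check() repeatedly until it returns True or the timeout elapses.

    Hypothetical helper: sleep and clock are injectable so the loop can be
    exercised without real delays.
    """
    deadline = clock() + timeout
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


if __name__ == "__main__":
    # Stand-in predicate: pretend the peer shows up on the third poll.
    answers = iter([False, False, True])
    print(wait_until(lambda: next(answers), timeout=5, interval=0))
```

Checking once before sleeping means an already-joined peer is reported immediately, and the deadline is only consulted after a failed check.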

3. Claude Code CLI (#70)

  • With CLAUDE_CODE_OAUTH_TOKEN set — injects token into remote compose
  • setup-claude.sh install inside container — installs CLI, creates claude user
  • claude wrapper auto-switches to non-root user when run as root
  • setup-claude.sh start — refreshes workspace permissions
  • Without token — silently skips (no error)

4. Container lifecycle

  • Fresh container — runs post-create.sh then post-start.sh inside container
  • Existing running container — runs only post-start.sh (skips post-create)
  • Lifecycle scripts not present — skips gracefully with log message
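The three lifecycle cases above amount to a small selection rule. A minimal sketch follows, assuming the script names `post-create.sh` and `post-start.sh` from the matrix; the function name `lifecycle_scripts` is hypothetical and the real logic lives in the Bash orchestrator.

```python
def lifecycle_scripts(fresh: bool, present: set[str]) -> list[str]:
    """Return the lifecycle scripts to run, in order.

    fresh   -- True when the container was just created (or recreated)
    present -- names of lifecycle scripts that exist in the container
    """
    wanted = ["post-create.sh", "post-start.sh"] if fresh else ["post-start.sh"]
    # Scripts that are not present are skipped rather than treated as errors.
    return [name for name in wanted if name in present]


if __name__ == "__main__":
    both = {"post-create.sh", "post-start.sh"}
    print(lifecycle_scripts(True, both))
    print(lifecycle_scripts(False, both))
    print(lifecycle_scripts(True, set()))
```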

5. --bootstrap (#235)

  • First run on clean host — prompts for projects_dir, creates config, forwards GHCR auth, clones devcontainer repo, builds image
  • --bootstrap --yes — uses defaults without prompting
  • Re-run — reads existing config, skips prompts, pulls latest, rebuilds
  • GHCR auth forwarding — podman credentials or GHCR_TOKEN copied to remote

6. gh:org/repo[:branch] (#236)

  • devc-remote.sh myserver gh:vig-os/fd5 — clones to ~/Projects/fd5, starts devcontainer
  • devc-remote.sh myserver gh:vig-os/fd5:feature/my-branch — clones and checks out branch
  • Re-run with repo already cloned — fetches, doesn't re-clone
  • devc-remote.sh myserver:~/custom/path gh:vig-os/fd5 — overrides clone location
  • Branch switch on existing clone — checks out new branch
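For illustration, the `gh:org/repo[:branch]` syntax exercised above could be parsed as follows. This is a sketch under the assumption that the first `:` after the repo name starts the branch, so branch names may themselves contain `/`; `GhTarget` and `parse_gh_target` are hypothetical names, not functions from `devc-remote.sh`.

```python
from typing import NamedTuple, Optional


class GhTarget(NamedTuple):
    org: str
    repo: str
    branch: Optional[str]


def parse_gh_target(spec: str) -> GhTarget:
    """Split 'gh:org/repo[:branch]' into its parts."""
    if not spec.startswith("gh:"):
        raise ValueError(f"not a gh: target: {spec!r}")
    rest = spec[len("gh:"):]
    # Everything after the first ':' is the branch, which may contain '/'.
    slug, sep, branch = rest.partition(":")
    org, slash, repo = slug.partition("/")
    if not org or not slash or not repo or "/" in repo:
        raise ValueError(f"expected org/repo in {spec!r}")
    if sep and not branch:
        raise ValueError(f"empty branch in {spec!r}")
    return GhTarget(org, repo, branch or None)


if __name__ == "__main__":
    print(parse_gh_target("gh:vig-os/fd5"))
    print(parse_gh_target("gh:vig-os/fd5:feature/my-branch"))
```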

7. Compose file parsing

  • read_compose_files() correctly reads dockerComposeFile array from devcontainer.json
  • compose_cmd_with_files() builds correct -f flags
  • Works with single file string and multi-file array
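The two helpers checked above are Bash functions; a Python rendering of the same idea might look like the sketch below. It assumes `devcontainer.json` has already been parsed into a dict (real files may contain comments, which plain `json.loads` does not accept) and reuses the Bash function names purely for readability.

```python
import json
from typing import Any


def read_compose_files(devcontainer: dict[str, Any]) -> list[str]:
    """Normalize dockerComposeFile, which may be a string or an array."""
    value = devcontainer.get("dockerComposeFile", [])
    if isinstance(value, str):
        return [value]
    return list(value)


def compose_cmd_with_files(base: list[str], files: list[str]) -> list[str]:
    """Append one '-f <file>' pair per compose file to the base command."""
    cmd = list(base)
    for path in files:
        cmd += ["-f", path]
    return cmd


if __name__ == "__main__":
    # File names here are illustrative only.
    config = json.loads(
        '{"dockerComposeFile": ["docker-compose.yml", "docker-compose.local.yml"]}'
    )
    print(compose_cmd_with_files(["docker", "compose"], read_compose_files(config)))
```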

8. Edge cases

  • Low disk space warning (<2GB) — requires special host (covered by unit test)
  • Remote host without compose — requires special host (covered by unit test)
  • Remote host without container runtime — requires special host (covered by unit test)
  • SSH connection failure — clear error message
  • macOS remote host — not tested (covered by unit test)

Bugs found and fixed during testing

  • SSH drops empty args / expands ~ in remote_clone_project — fixed with sentinel values (17ca79f)
  • GHCR auth forwarding moved from bootstrap-only to every deploy (9280224)
  • Tailscale SSH required real TUN device, not userspace networking (c209f1d)
  • Tailscale key regenerated on every deploy to avoid expired ephemeral keys (0b0bcef)
  • ~/.local/bin added to PATH for SSH compose commands (15120fb)
  • Reverted unnecessary podman-compose preference logic (ea3af49)
  • Pre-flight check for stale local Tailscale daemon (49c7a4e)
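The first bug in this list arises because `ssh` joins its arguments into a single string that the remote shell re-parses, so an empty argument vanishes and an unquoted `~` is expanded. A minimal sketch of the sentinel idea follows; the sentinel string `__EMPTY__` and the helper names are assumptions for illustration, not the values used in commit 17ca79f.

```python
import shlex

EMPTY = "__EMPTY__"  # hypothetical sentinel standing in for an empty argument


def encode_args(args: list[str]) -> list[str]:
    """Replace empty strings so they survive the trip through ssh."""
    return [EMPTY if arg == "" else arg for arg in args]


def decode_args(args: list[str]) -> list[str]:
    """Reverse encode_args on the receiving side."""
    return ["" if arg == EMPTY else arg for arg in args]


def remote_command(script: str, args: list[str]) -> str:
    """Build the single string handed to ssh, quoting every argument."""
    parts = [shlex.quote(script)]
    parts += [shlex.quote(arg) for arg in encode_args(args)]
    return " ".join(parts)


if __name__ == "__main__":
    # 'clone.sh' is a placeholder script name.
    print(remote_command("clone.sh", ["", "~/Projects/fd5"]))
```

Because `shlex.quote` wraps `~/Projects/fd5` in single quotes, the remote shell no longer expands the tilde on its own; the receiving script then has to expand it deliberately.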

Changelog Entry

See CHANGELOG.md ## Unreleased section — fully up to date.

Checklist

  • My code follows the project's style guidelines
  • I have performed a self-review of my code
  • I have updated the documentation accordingly
  • I have updated CHANGELOG.md in the [Unreleased] section
  • My changes generate no new warnings or errors
  • I have added tests that prove my feature works
  • New and existing unit tests pass locally with my changes
  • Manual integration tests pass (chore: manual integration tests for remote devcontainer features #243)

@gerchowl gerchowl self-assigned this Feb 23, 2026
@gerchowl gerchowl requested a review from c-vigo as a code owner February 24, 2026 15:15
@c-vigo

c-vigo commented Feb 25, 2026

Contributor

This PR requires an update to assets/workspace/.devcontainer/README.md explaining the feature to users.

c-vigo and others added 4 commits February 25, 2026 13:37
Pointing directly to the script file sometimes leads to execution problems.

Refs: #204
Update `post_create` and `post_attach` tests
Add missing `post_start` test

Refs: #204
## Description

Ensure devcontainer lifecycle hooks execute reliably in downstream
workspaces by invoking hook scripts through `/bin/bash` in
`devcontainer.json`. Add integration coverage for all three lifecycle
commands (`postCreate`, `postStart`, `postAttach`) so command format
regressions are caught by tests.

## Type of Change

- [ ] `feat` -- New feature
- [x] `fix` -- Bug fix
- [x] `docs` -- Documentation only
- [ ] `chore` -- Maintenance task (deps, config, etc.)
- [ ] `refactor` -- Code restructuring (no behavior change)
- [x] `test` -- Adding or updating tests
- [ ] `ci` -- CI/CD pipeline changes
- [ ] `build` -- Build system or dependency changes
- [ ] `revert` -- Reverts a previous commit
- [ ] `style` -- Code style (formatting, whitespace)

### Modifiers

- [ ] Breaking change (`!`) -- This change breaks backward compatibility

## Changes Made

- `assets/workspace/.devcontainer/devcontainer.json`
  - Update `postCreateCommand`, `postStartCommand`, and
    `postAttachCommand` to run scripts via `/bin/bash`.
  - Keep existing script paths unchanged while making command execution
    more robust.
- `tests/test_integration.py`
  - Update expectations for `postAttachCommand` and `postCreateCommand` to
    include `/bin/bash`.
  - Add `test_devcontainer_json_post_start_command` to validate
    `postStartCommand` uses the same bash-wrapped format.
- `CHANGELOG.md`
  - Add an Unreleased `### Fixed` entry for issue `#204` describing the
    lifecycle-command fix and user-visible error resolved.
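
As a rough illustration of what the integration assertions check, the sketch below flags lifecycle commands that are not invoked through `/bin/bash`. The helper name and the script paths in the sample dict are hypothetical; the real expectations live in `tests/test_integration.py`.

```python
LIFECYCLE_KEYS = ("postCreateCommand", "postStartCommand", "postAttachCommand")


def commands_not_using_bash(devcontainer: dict) -> list[str]:
    """Return the lifecycle keys whose command does not start with /bin/bash."""
    return [
        key
        for key in LIFECYCLE_KEYS
        if not str(devcontainer.get(key, "")).startswith("/bin/bash ")
    ]


if __name__ == "__main__":
    # Placeholder paths, not the template's actual values.
    sample = {
        "postCreateCommand": "/bin/bash path/to/post-create.sh",
        "postStartCommand": "/bin/bash path/to/post-start.sh",
        "postAttachCommand": "path/to/post-attach.sh",
    }
    print(commands_not_using_bash(sample))
```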

## Changelog Entry

### Fixed
- **Devcontainer lifecycle commands fail in mock-up folders with crun
  getcwd error** ([#204](#204))
  - Run post-create, post-start, and post-attach commands via `/bin/bash`
    in `devcontainer.json` for stable command resolution on attach
  - Prevent attach-time failure where OCI runtime reports `getcwd: No such
    file or directory`
  - Update tests in `tests/test_integration.py`

## Testing

- [ ] Tests pass locally (`just test`)
- [ ] Manual testing performed (describe below)

### Manual Testing Details

Added/updated integration assertions in `tests/test_integration.py`:
- `postAttachCommand` expected value includes `/bin/bash`
- `postCreateCommand` expected value includes `/bin/bash`
- new `postStartCommand` assertion

## Checklist

- [x] My code follows the project's style guidelines
- [x] I have performed a self-review of my code
- [ ] I have commented my code, particularly in hard-to-understand areas
- [ ] I have updated the documentation accordingly (edit
`docs/templates/`, then run `just docs`)
- [x] I have updated `CHANGELOG.md` in the `[Unreleased]` section (and
pasted the entry above)
- [x] My changes generate no new warnings or errors
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] New and existing unit tests pass locally with my changes
- [x] Any dependent changes have been merged and published

## Additional Notes

Closes #204

Refs: #204
@c-vigo

c-vigo commented Feb 25, 2026

Contributor

Opened follow-up bug issue #202 for the host-dependent BATS failure (detect_editor_cli neither-path case).

Root cause: test assumes /usr/bin:/bin has no code, which is false on some hosts.

Planned fix in #202: isolate PATH with an empty temp dir and invoke /bin/bash under env -i so detection is deterministic.

Fixed in #205

@c-vigo

c-vigo commented Feb 25, 2026

Contributor

Running from the host directly:

carlosvigo@vigolaptop:~/Documents/vigOS/tmp$ just devc-remote ksb
bash scripts/devc-remote.sh ksb
ℹ  Detecting local editor CLI...
✓  Using cursor
ℹ  Checking SSH connectivity to ksb...
✓  SSH connection OK
ℹ  Running pre-flight checks on ksb...
✗  No .devcontainer/ found in ~. Is this a devcontainer-enabled project?
error: Recipe `devc-remote` failed on line 382 with exit code 1

Running from inside the devcontainer:

root@0491bc9d9819:/workspace/tmp# just devc-remote ksb
bash scripts/devc-remote.sh ksb
ℹ  Detecting local editor CLI...
✓  Using cursor
ℹ  Checking SSH connectivity to ksb...
✗  Cannot connect to ksb. Check your SSH config and network.
error: Recipe `devc-remote` failed on line 382 with exit code 1

I assume I am supposed to run from the devcontainer, but my SSH configuration (and maybe also the credentials) are not being forwarded.

…tion' into bugfix/202-deterministic-detect-editor-cli-test
## Description

Make the `detect_editor_cli` negative-path BATS test deterministic across host environments where `/usr/bin/code` may be present.

## Type of Change

- [ ] `feat` -- New feature
- [x] `fix` -- Bug fix
- [ ] `docs` -- Documentation only
- [ ] `chore` -- Maintenance task (deps, config, etc.)
- [ ] `refactor` -- Code restructuring (no behavior change)
- [ ] `test` -- Adding or updating tests
- [ ] `ci` -- CI/CD pipeline changes
- [ ] `build` -- Build system or dependency changes
- [ ] `revert` -- Reverts a previous commit
- [ ] `style` -- Code style (formatting, whitespace)

### Modifiers

- [ ] Breaking change (`!`) -- This change breaks backward compatibility

## Changes Made

- `tests/bats/devc-remote.bats`
  - Replace `PATH="/usr/bin:/bin"` assumption with a temp empty PATH directory.
  - Execute via `/bin/bash "$DEVC_REMOTE"` under `env -i` to avoid shebang/PATH lookup side effects.
  - Clean up temporary directory at test end.
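
The isolation technique described above can be sketched outside BATS as well. The example below is an approximation in Python, not the test itself: passing `env={"PATH": empty_dir}` plays the role of `env -i`, and bash is invoked by absolute path so the lookup of the interpreter does not depend on `PATH`. The function name is hypothetical.

```python
import subprocess
import tempfile


def command_visible_with_empty_path(name: str) -> bool:
    """Report whether bash can resolve `name` when PATH is an empty directory."""
    with tempfile.TemporaryDirectory() as empty_dir:
        result = subprocess.run(
            ["/bin/bash", "-c", 'command -v "$1" >/dev/null', "bash", name],
            env={"PATH": empty_dir},  # nothing else is inherited from the host
            check=False,
        )
    # The temporary directory is removed when the with-block exits.
    return result.returncode == 0


if __name__ == "__main__":
    for editor in ("cursor", "code"):
        print(editor, command_visible_with_empty_path(editor))
```

With this setup the result no longer depends on whether the host happens to ship `/usr/bin/code`.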

## Changelog Entry

No changelog needed: this is an internal test determinism fix with no user-facing behavior change.

## Testing

- [x] Tests pass locally (`just test`)
- [ ] Manual testing performed (describe below)

### Manual Testing Details

N/A

## Checklist

- [x] My code follows the project's style guidelines
- [x] I have performed a self-review of my code
- [ ] I have commented my code, particularly in hard-to-understand areas
- [ ] I have updated the documentation accordingly (edit `docs/templates/`, then run `just docs`)
- [ ] I have updated `CHANGELOG.md` in the `[Unreleased]` section (and pasted the entry above)
- [x] My changes generate no new warnings or errors
- [x] I have added tests that prove my fix is effective or that my feature works
- [x] New and existing unit tests pass locally with my changes
- [ ] Any dependent changes have been merged and published

## Additional Notes

Validated with:
- `npx bats tests/bats/devc-remote.bats -f "detect_editor_cli fails when neither cursor nor code in PATH"`
- `npx bats tests/bats/devc-remote.bats`

Refs: #202
Added an import statement for '.devcontainer/justfile.worktree' to the main justfile, allowing for improved workspace configuration management.
…remote-devc-orchestration

# Conflicts:
#	CHANGELOG.md
#	CONTRIBUTE.md
#	README.md
#	scripts/manifest.toml
… to devcontainer.json (#207)

## Description

When a user's Cursor/VS Code settings configure `terminal.integrated.defaultProfile.linux` to a shell not present in the container image (e.g. `zsh`), the Agent chat shell fails with `forkpty(3) failed` and the extension host times out. This adds a `"terminal.integrated.defaultProfile.linux": "bash"` setting to the devcontainer.json template so the container always overrides the user's host-side preference.

## Type of Change

- [x] `fix` -- Bug fix

### Modifiers

- [ ] Breaking change (`!`) -- This change breaks backward compatibility

## Changes Made

- Add `"terminal.integrated.defaultProfile.linux": "bash"` to `assets/workspace/.devcontainer/devcontainer.json` settings
- Add BATS test verifying the setting is present in the template
- Update CHANGELOG.md

## Changelog Entry

### Fixed

- **Cursor Agent shell fails with forkpty(3) when host sets zsh as default terminal profile** ([#206](#206))
  - Add `terminal.integrated.defaultProfile.linux: "bash"` to devcontainer.json template settings
  - Prevents user's host-side shell preference from leaking into the container

## Testing

- [x] Tests pass locally (`npx bats tests/bats/init-workspace.bats`)
- [x] Manual testing performed (describe below)

### Manual Testing Details

- Verified the devcontainer.json template is valid JSON after the change
- Confirmed the new BATS test fails before the fix (RED) and passes after (GREEN)

## Checklist

- [x] My code follows the project's style guidelines
- [x] I have performed a self-review of my code
- [x] I have updated `CHANGELOG.md` in the `[Unreleased]` section (and pasted the entry above)
- [x] My changes generate no new warnings or errors
- [x] I have added tests that prove my fix is effective or that my feature works
- [x] New and existing unit tests pass locally with my changes

## Additional Notes

The container image ships `bash` but not `zsh`. Users with `"terminal.integrated.defaultProfile.linux": "zsh"` in their global Cursor settings hit this silently. The devcontainer.json `customizations.vscode.settings` override ensures the container always uses `bash` regardless of the host-side preference.

Refs: #206
Opt-in Tailscale SSH setup for devcontainers. Both subcommands are
silent no-ops when TAILSCALE_AUTHKEY is unset.

Refs: #208
- Added `terminal.integrated.defaultProfile.linux: "bash"` to the devcontainer.json template to prevent host shell preferences from affecting the container.
- Updated CHANGELOG to reflect this change and its purpose.
- Added a test to verify the default terminal profile setting.

Refs: #206
gerchowl and others added 27 commits March 27, 2026 11:15
SSH non-login shells don't source .bashrc/.profile, so tools
installed to ~/.local/bin (like uv) are not found during bootstrap.

Refs: #70
…project files

setup-tailscale.sh: replace underscores with hyphens in auto-generated
hostname — DNS labels cannot contain underscores.

init-workspace.sh: add pyproject.toml, uv.lock, .python-version to
PRESERVE_FILES so --force upgrades don't overwrite project config.

Refs: #70
devc-remote.sh now reads CLAUDE_CODE_OAUTH_TOKEN from macOS keychain
(service: devc-remote) when not set as env var. First step toward
full secret resolution chain (#464).

Refs: #70
Move Tailscale OAuth credentials (TS_CLIENT_ID, TS_CLIENT_SECRET) to
the same keychain-fallback pattern as Claude OAuth token. All three
secrets now resolve at deploy time: env var → macOS keychain → skip.

No secrets need to be in shell profile env vars anymore.

Refs: #70
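
A minimal sketch of the env var, then keychain, then skip resolution order described above. It assumes the keychain entry uses the service name `devc-remote` (from the commit message) and the variable name as the account, which is a guess; `resolve_secret` and `keychain_lookup` are hypothetical names.

```python
import os
import subprocess
from typing import Callable, Mapping, Optional


def keychain_lookup(name: str, service: str = "devc-remote") -> Optional[str]:
    """Query the macOS keychain; return None when unavailable or not found."""
    try:
        result = subprocess.run(
            ["security", "find-generic-password", "-s", service, "-a", name, "-w"],
            capture_output=True,
            text=True,
            check=False,
        )
    except FileNotFoundError:  # not on macOS, no `security` binary
        return None
    value = result.stdout.strip()
    return value if result.returncode == 0 and value else None


def resolve_secret(
    name: str,
    env: Mapping[str, str] = os.environ,
    lookup: Callable[[str], Optional[str]] = keychain_lookup,
) -> Optional[str]:
    """Environment variable first, keychain second, otherwise None (skip)."""
    value = env.get(name)
    if value:
        return value
    return lookup(name)


if __name__ == "__main__":
    for secret in ("TS_CLIENT_ID", "TS_CLIENT_SECRET", "CLAUDE_CODE_OAUTH_TOKEN"):
        print(secret, "resolved" if resolve_secret(secret) else "skipped")
```

A `None` result corresponds to the "skip" branch: the matching injection step is simply not performed.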
Extract sanitize_dns_label() helper and use it in wait_for_tailscale()
so the pattern matches the sanitized hostname from setup-tailscale.sh.

Refs: #70
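
For reference, a hostname sanitizer along these lines could look like the sketch below. Only the underscore-to-hyphen replacement comes from the commit messages; replacing other characters, trimming hyphens, and truncating to the 63-character DNS label limit are assumptions added for illustration.

```python
import re


def sanitize_dns_label(name: str) -> str:
    """Turn an arbitrary name into something usable as a DNS label."""
    label = name.replace("_", "-")  # the fix described in the commits
    label = re.sub(r"[^A-Za-z0-9-]", "-", label)  # assumption: other characters too
    return label.strip("-")[:63]  # assumption: trim and cap at the label limit


if __name__ == "__main__":
    print(sanitize_dns_label("my_project_devc"))
```

Using one helper on both sides keeps the hostname that the container registers and the pattern that `wait_for_tailscale()` looks for in agreement.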
Previously skipped compose up when container was already running,
causing injected secrets to never reach the container. Now always
runs compose up -d which is idempotent and auto-recreates only
when config (env vars, devices, etc.) has changed.

Refs: #70
uv sync failure (corrupt lockfile, version mismatch) was killing the
entire lifecycle script before Tailscale and Claude Code could install.
Sync is now non-fatal — a warning is printed but setup continues.

Refs: #70
compose up -d silently recreates containers when config changes,
but CONTAINER_FRESH stayed 0 because it only checked the pre-up
state. Now compares container ID before and after compose up to
detect recreates and run post-create lifecycle accordingly.

Refs: #70
podman compose prefixes output with >>>> banner lines which polluted
the container ID comparison, making recreated containers look identical.

Refs: #70
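
The recreate detection from the last two commits can be sketched as follows: strip the `>>>>` banner lines that podman compose prepends, then compare the container ID captured before and after `compose up -d`. The function names are hypothetical and the sample strings are made up.

```python
def extract_container_id(output: str) -> str:
    """Return the first non-empty line that is not a '>>>>' banner line."""
    for line in output.splitlines():
        line = line.strip()
        if line and not line.startswith(">>>>"):
            return line
    return ""


def was_recreated(before: str, after: str) -> bool:
    """True when a container exists after 'up' and its ID differs from before."""
    old_id = extract_container_id(before)
    new_id = extract_container_id(after)
    return bool(new_id) and old_id != new_id


if __name__ == "__main__":
    before = ">>>> banner\nabc123\n"
    print(was_recreated(before, ">>>> banner\nabc123\n"))
    print(was_recreated(before, ">>>> banner\ndef456\n"))
```

An empty "before" ID (no container yet) also counts as a recreate, so a first-time start takes the post-create path.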
Containers often have clock skew causing apt-get update to fail with
"Release file not valid yet". Add fallback manual install with
Acquire::Check-Valid-Until=false. Also remove debug log line.

Refs: #70
…skew

apt-get update on all repos fails when container clock is skewed.
Install directly from Tailscale repo with Dir::Etc::sourcelist
to only update that single source, bypassing clock-skew on system repos.

Refs: #70
Same issue as Tailscale: container clock skew breaks apt-get update
on system repos. Install Node.js from nodesource repo only.

Refs: #70
The nodesource setup_lts.x script runs apt-get update on all repos,
failing with clock skew. Add the repo GPG key and source list manually,
then update only that repo.

Refs: #70
gpg refuses to overwrite an existing file without --yes, causing
the nodesource key import to fail on container re-creation.

Refs: #70
nodesource nodejs depends on system python3, so we need all repos
updated with Check-Valid-Until=false, not just the nodesource repo.

Refs: #70
apt returns 100 when any repo has clock issues even with
Check-Valid-Until=false. The repos we need still update successfully
so ignore the exit code.

Refs: #70
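
Tolerating a specific non-zero exit code, as this commit does for apt, might be sketched like this. `run_tolerating` is a hypothetical helper; the `apt-get` invocation is shown only as data and is not executed by the example.

```python
import subprocess
import sys


def run_tolerating(cmd: list[str], ok_codes: frozenset[int] = frozenset({0})) -> int:
    """Run cmd and raise only if the exit code is not in ok_codes."""
    result = subprocess.run(cmd, check=False)
    if result.returncode not in ok_codes:
        raise subprocess.CalledProcessError(result.returncode, cmd)
    return result.returncode


# How the helper might be applied to the case described above (not run here):
APT_UPDATE = ["apt-get", "-o", "Acquire::Check-Valid-Until=false", "update"]
APT_OK = frozenset({0, 100})

if __name__ == "__main__":
    # Stand-in command that exits with 100, mimicking the skewed-clock case.
    fake = [sys.executable, "-c", "import sys; sys.exit(100)"]
    print(run_tolerating(fake, APT_OK))
```

Listing the accepted codes explicitly keeps unrelated failures visible, which a blanket "ignore errors" would hide.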
…e user

Mirrors local dev environment aliases. cl=claude, cld=claude
--dangerously-skip-permissions. Added to both claude user .bashrc
and root .bashrc for consistent DX regardless of SSH user.

Refs: #70
Tailscale SSH sessions don't inherit compose env vars. The wrapper
now reads CLAUDE_CODE_OAUTH_TOKEN from /proc/1/environ when not
in the current environment, matching the claude user .bashrc pattern.

Refs: #70
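
On Linux, `/proc/<pid>/environ` holds `KEY=value` entries separated by NUL bytes, which is what makes this fallback possible. A minimal sketch of reading the token that way follows; the function names are hypothetical and the wrapper itself is a shell script, so this only illustrates the parsing.

```python
from typing import Optional


def parse_environ(raw: bytes) -> dict[str, str]:
    """Parse NUL-separated KEY=value entries as found in /proc/<pid>/environ."""
    env: dict[str, str] = {}
    for entry in raw.split(b"\0"):
        if b"=" not in entry:
            continue  # skips the empty chunk after the trailing NUL
        key, _, value = entry.partition(b"=")
        env[key.decode()] = value.decode(errors="replace")
    return env


def token_from_pid1(
    path: str = "/proc/1/environ", name: str = "CLAUDE_CODE_OAUTH_TOKEN"
) -> Optional[str]:
    """Return the variable from PID 1's environment, or None if unreadable."""
    try:
        with open(path, "rb") as handle:
            return parse_environ(handle.read()).get(name)
    except OSError:
        return None


if __name__ == "__main__":
    print(parse_environ(b"HOME=/root\0CLAUDE_CODE_OAUTH_TOKEN=example\0"))
```

Reading PID 1 works because the container's init process is started by compose and therefore carries the injected variables, even when the SSH session does not.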
runuser without --pty doesn't allocate a terminal, causing Claude
Code's interactive TUI to exit immediately. Add --pty flag.

Refs: #70
Copies ~/.claude/{CLAUDE.md,settings.json,commands/} into the
container's claude user home after lifecycle scripts run. Gives
the remote Claude Code the same global instructions, permissions,
and custom commands as the local dev environment.

Refs: #70
Pre-create .claude.json with hasCompletedOnboarding=true so the
interactive TUI doesn't show the login method selection screen
when CLAUDE_CODE_OAUTH_TOKEN is already set.

Refs: #70
hasCompletedOnboarding alone was not enough — Claude Code also
checks hasCompletedAuthFlow before skipping the interactive
login method selection screen.

Refs: #70
setup-claude.sh now creates settings.json with additionalDirectories
pointing to the workspace project and skipDangerousModePermissionPrompt
so Claude Code trusts the workspace without interactive prompts.

Refs: #70
Instead of pre-configuring additionalDirectories, pass the current
working directory via --add-dir so Claude Code always trusts wherever
it's launched from.

Refs: #70
… settings

Claude Code checks hasTrustDialogAccepted in two places: the global
.claude.json (projects dict keyed by path) and the per-project
settings.json. Set both to avoid the interactive trust prompt on
first run.

Refs: #70
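
Taken together, the onboarding and trust commits above describe a handful of flags in `.claude.json`. The sketch below builds such a structure; the key names are taken from the commit messages, while the exact nesting, the helper name, and the example path are assumptions rather than a documented schema.

```python
import json


def claude_state(project_path: str) -> dict:
    """Build the flags that, per the commits, suppress the interactive prompts."""
    return {
        "hasCompletedOnboarding": True,
        "hasCompletedAuthFlow": True,
        # Assumed layout: a projects dict keyed by absolute project path.
        "projects": {project_path: {"hasTrustDialogAccepted": True}},
    }


if __name__ == "__main__":
    # '/workspace/fd5' is a placeholder path.
    print(json.dumps(claude_state("/workspace/fd5"), indent=2))
```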
Refs: #70

apt block extension: expect/neovim/ripgrep/fd-find/bat/fzf
  + symlinks /usr/local/bin/{fd,bat} so Debian fdfind/batcat names
    do not trip up agents.

Binary release downloads: eza/delta/lazygit/zoxide/starship/charm-freeze.

Claude Code baked in:
  - install via official installer (3-attempt retry, mirrors cursor-agent)
  - symlinked to /usr/local/bin/claude
  - ENV IS_SANDBOX=1 lets --dangerously-skip-permissions bypass the
    uid-0 refusal cleanly (container is the trust boundary)
  - aliases cc=claude, cld='claude --dangerously-skip-permissions'

Verified: image builds (linux/arm64), all 14 added tools resolve on PATH,
IS_SANDBOX=1 set, aliases written to /root/.bashrc.